🕷️️ Job Radar • SCRAPING

Job Radar. Live notifications. AI processed.

upwork.com 2026-04-30 🟠

🔹 [Format] Convert official election Form 20 PDFs into a clean, structured Excel dataset
👤 Client: 🇮🇳 India • Member since 2026-04-28
💰 Price: $100
🚩 Problem: Data extracted from the PDF files is inaccurate and unstructured, which makes analysis inefficient.
📦 Existing: Not specified

Specifications:

[Target] Extract tabular data (candidates, votes, totals) accurately from 243 PDFs
[Method] Use OCR tools to extract the PDF tables, then manually verify and clean the results
[UI/UX] Not applicable
[Stack] Python with libraries such as Camelot or Tabula for PDF table extraction (both read text-based PDFs, so scanned pages need an OCR pass first); pandas for data manipulation; openpyxl for Excel export
[Security] Ensure data privacy and confidentiality during handling and storage
[Format] Clean, structured Excel dataset with no merged cells, consistent column structure
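
A minimal sketch of what "no merged cells, consistent column structure" could mean in practice, using only the standard library. The column names and the choice of which columns to fill down are assumptions for illustration; the posting does not give the actual Form 20 headers. The function name flatten_rows is hypothetical too.

```python
# Hypothetical column layout; the real Form 20 headers are not given in the post.
COLUMNS = ["constituency", "polling_station", "candidate", "votes"]
# Columns where a merged cell usually leaves blanks on the rows beneath it (assumed).
FILL_DOWN = ("constituency", "polling_station")


def flatten_rows(raw_rows):
    """Return rows as lists in a fixed column order.

    Blank or missing values in FILL_DOWN columns are replaced with the last
    value seen above them, which is what unmerging a cell amounts to.
    """
    last_seen = {name: None for name in FILL_DOWN}
    flat = []
    for raw in raw_rows:
        row = []
        for name in COLUMNS:
            value = raw.get(name)
            if isinstance(value, str):
                value = value.strip() or None  # treat whitespace-only as blank
            if name in FILL_DOWN:
                if value is None:
                    value = last_seen[name]  # repeat the merged value
                else:
                    last_seen[name] = value
            row.append(value)
        flat.append(row)
    return flat
```

Every output row has the same length and order, so it can be handed to pandas or openpyxl without further reshaping.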

Workflow:

1. Process 2-3 sample PDFs as a paid test to validate accuracy and quality of output.
2. Develop a Python script that uses OCR tools for the initial data extraction from all 243 PDFs.
3. Manually verify extracted data, correct errors, and ensure numeric fields are formatted correctly.
4. Clean up the Excel dataset by removing merged cells and checking that no rows are missing.
5. Flag unclear or unreadable entries for further review.
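
Steps 3 and 5 above can be sketched as a small standard-library helper that normalizes vote counts and flags anything it cannot read instead of guessing. The function names are hypothetical. The rule that candidate votes should add up to the row total is also an assumption about the form layout, and should be confirmed on the sample PDFs.

```python
import re


def parse_votes(cell):
    """Return (value, flag): value is an int or None, flag is '' or a reason."""
    text = "" if cell is None else str(cell).strip()
    if not text:
        return None, "missing"
    # Drop thousands separators and stray spaces ("1,23,456" -> "123456").
    cleaned = re.sub(r"[,\s]", "", text)
    if re.fullmatch(r"[0-9]+", cleaned):
        return int(cleaned), ""
    # Typical OCR confusions (O for 0, l for 1) end up here for manual review.
    return None, "unreadable: " + text


def check_row(candidate_cells, total_cell):
    """Parse one row and list the reasons it needs manual review, if any."""
    flags = []
    votes = []
    for i, cell in enumerate(candidate_cells):
        value, flag = parse_votes(cell)
        if flag:
            flags.append(f"candidate {i + 1}: {flag}")
        votes.append(value)
    total, flag = parse_votes(total_cell)
    if flag:
        flags.append(f"total: {flag}")
    # Only compare sums when every cell parsed cleanly (assumed consistency rule).
    if not flags and sum(votes) != total:
        flags.append(f"sum {sum(votes)} != total {total}")
    return votes, total, flags
```

For example, check_row(["1,204", "38"], "1242") returns no flags, while a cell read as "1O4" is reported as unreadable rather than silently converted.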
